The rise of Uber as the global alternative taxi operator 
has attracted a lot of interest recently. Aside from the
media headlines which discuss the new phenomenon, e.g.
how it has disrupted the traditional transportation
industry, policy makers, economists, citizens and
scientists have engaged in a discussion that is centred
around the means to integrate the new generation of
sharing economy services in urban ecosystems. In this
work, we aim to shed new light on the discussion by taking
advantage of a publicly available longitudinal dataset that
describes the mobility of yellow taxis in New York City.
In addition to movement, this data contains information
on the fares paid by the taxi customers for each trip. As a
result, we are given the opportunity to provide a first
head-to-head comparison between the iconic yellow taxi and
its modern competitor, Uber, in one of the world's largest
metropolitan centres. We identify situations when Uber
X, the cheapest version of the Uber taxi service, tends to
be more expensive than yellow taxis for the same journey.
We also demonstrate how Uber's economic model effectively
takes advantage of well-known patterns in human
movement. Finally, we take our analysis a step further
by proposing a new mobile application that compares
taxi prices in the city to facilitate travellers' taxi
choices, hoping ultimately to lead to a reduction of
commuter costs. Our study provides a case for how big
datasets that become public can improve urban services for
consumers by offering the opportunity for transparency in
economic sectors that lack up-to-date regulations.

I. TAXI PRICE COMPARISON EXPERIMENT 

The New York City Taxi Dataset. The Freedom
of Information Law in the United States encourages public
authorities to release their data where appropriate to the
benefit of the citizens. In 2014 the law was used
by Chris Whong to acquire and post on the web one
of the most comprehensive taxi mobility datasets
available today. The dataset describes taxi journeys in New
York City during the full course of 2013, and informs
us not only on the origin and destination points of taxi
trips, noted in the related jargon as pick up and drop
off points respectively, but also on the financial costs
incurred by the customer (trip fare) with unprecedented
detail. This rather dense mobility dataset, containing
hundreds of millions of trips, is gigabytes in size and can
be downloaded from
http://chriswhong.com/open-data/foil_nyc_taxi/.

A sample of the traces generated by the data can be seen
in Figure 1, where we have drawn a black point for every
pick up and drop off point of a taxi journey.

FIG. 1: Marking the traces of New York City yellow taxis. For
every pick up and drop off point in a uniform sample of the
data we draw a black point.

Comparing Taxi Prices. In August 2014, Uber
opened up an API with access to valuable information
about its services. The occasion allowed us to perform a
first head-to-head comparative analysis of prices between
Uber and Yellow taxis in New York City. To achieve this
we ran the following experiment:

1. For every trip in the New York City Yellow Taxi
dataset, record the geographic coordinates (latitude
and longitude) of the pick up and drop off points.

2. Retrieve the total fare paid by the customer for
the trip (including the tip).

3. Query Uber's API and ask how much they would
charge for the same trip (same pick up and drop
off points), considering the cheapest version of the
service, Uber X.

4. Uber's API returns a value range indicating the
minimum and maximum price estimate. We take
the mean of the two values.

5. We then compare the prices from the two services.
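
Steps 4 and 5 of the experiment can be sketched in a few lines
of Python. This is only an illustration under stated
assumptions, not the code used in the study: the function
names and the example numbers are hypothetical, and the Uber
price range is passed in as plain numbers rather than fetched
from the API.

```python
def uber_point_estimate(low, high):
    """Midpoint of the (minimum, maximum) price range for a trip (step 4)."""
    return (low + high) / 2.0


def compare_trip(yellow_fare, uber_low, uber_high):
    """Compare one yellow taxi fare with the Uber X estimate (step 5).

    Returns (uber_estimate, difference, cheaper_service), where a
    positive difference means Uber X is the more expensive option.
    """
    uber = uber_point_estimate(uber_low, uber_high)
    diff = uber - yellow_fare
    if diff > 0:
        cheaper = "yellow"
    elif diff < 0:
        cheaper = "uber"
    else:
        cheaper = "tie"
    return uber, diff, cheaper


# Made-up numbers, for illustration only.
print(compare_trip(yellow_fare=11.5, uber_low=12, uber_high=16))
```

Taking the midpoint is the simplest way to reduce the range to
a single number; a more conservative comparison could use the
lower or the upper bound instead.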

As can be observed in Figure 2, where the distribution
of prices for the two services is shown, despite the
qualitative similarity of the two distributions, yellow
taxis appear on average (median) 1.4 US dollars cheaper than
Uber X. In Figure 3 we compare Uber and yellow taxis
from another perspective: for every observed yellow taxi
price, we show the median Uber X price. Uber appears
more expensive for prices below 35 dollars and begins to
become cheaper only after that threshold. As one would
expect, the cheaper journeys are in principle those of
shorter range. As observed in a variety of empirical
data, human mobility tends to be characterised by a vast
majority of short trips [1, 2]. This observation therefore
suggests that Uber's economic model exploits this trend
of human mobility in order to maximise revenue. We also
confirm the skewed frequency distribution of movement
distances in the present context by visualising it in
Figure 4, where we note a mean distance for a yellow taxi
trip in New York equal to 2.09.

FIG. 2: Distribution of prices per journey for Uber X and
Yellow Taxis in New York City.

FIG. 3: Median Uber price for a given Yellow Taxi price.
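
A curve such as the one in Figure 3 can be obtained by
grouping trips by their yellow taxi fare and taking the median
Uber X estimate within each group. The sketch below assumes
fixed-width fare bins; the bin width, the function name and
the sample data are our own illustrative choices and are not
taken from the study.

```python
from collections import defaultdict
from statistics import median


def median_uber_by_yellow_price(pairs, bin_width=5.0):
    """Group (yellow_fare, uber_estimate) pairs into yellow-fare bins.

    Returns {bin_lower_edge: median Uber estimate in that bin},
    ordered by increasing yellow fare.
    """
    bins = defaultdict(list)
    for yellow, uber in pairs:
        lower_edge = (yellow // bin_width) * bin_width
        bins[lower_edge].append(uber)
    return {edge: median(values) for edge, values in sorted(bins.items())}


# Made-up (yellow fare, Uber estimate) pairs, for illustration only.
pairs = [(8.0, 10.0), (9.5, 12.0), (7.0, 9.0), (41.0, 38.0), (44.0, 40.0)]
print(median_uber_by_yellow_price(pairs))
```

The median is used rather than the mean so that a few unusually
expensive trips in a bin do not dominate the result.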

The above experiment may involve a number of biases,
which we list here. The NYC Yellow taxi data corresponds
to the year 2013, whereas the Uber data corresponds to
2014. Note, however, that the prices for yellow taxis in
the city had last changed in 2012, after 8 years [3], so
the data should offer a good approximation of today's
prices. Further, there was no control for time of the
day/week in the API query, an additional dimension which
should be incorporated when available. Nevertheless, we
argue that the process of comparing two different companies
that provide the same service in the same geographic area
is of value to commuters. Just as consumers have long had
open access to airfares, allowing for transparency in a
free, competitive market, we believe that similar
approaches could benefit commuters in modern cities.


FIG. 4: Distribution of geographic distances between drop off 
and pick up points for Yellow Taxi journeys. 





FIG. 5: Geographic comparison between Uber and Yellow 
Taxi prices. We paint an area black if Uber is cheaper by trip 
majority and yellow otherwise. 
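
The colouring rule in the caption of Figure 5 amounts to a
majority vote per area. A minimal sketch follows, assuming each
trip has already been assigned to an area identifier (how areas
are defined and how trips are assigned to them is not specified
here, so `area_id` is a hypothetical input) and that ties are
painted yellow.

```python
from collections import defaultdict


def colour_areas(trips):
    """trips: iterable of (area_id, yellow_fare, uber_estimate).

    Returns {area_id: "black"} when Uber is cheaper for a strict
    majority of the area's trips, and {area_id: "yellow"} otherwise.
    """
    uber_wins = defaultdict(int)
    totals = defaultdict(int)
    for area, yellow, uber in trips:
        totals[area] += 1
        if uber < yellow:
            uber_wins[area] += 1
    return {
        area: "black" if 2 * uber_wins[area] > totals[area] else "yellow"
        for area in totals
    }


# Made-up trips, for illustration only.
trips = [("A", 10, 8), ("A", 10, 9), ("A", 10, 12), ("B", 10, 12), ("B", 10, 8)]
print(colour_areas(trips))
```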


II. HELPING COMMUTERS 

Our observations show that it might be financially
advantageous on average for travellers to choose either
Yellow Cabs or Uber depending on the duration of their
journey. However, the specific journey they are willing
to take matters. In order to help users make the right
decision, we have developed a smartphone app, called
OpenStreetCab, designed as follows.

One limitation for the design of our service is that only
prices for trips with origins and destinations in the New
York City Taxi Dataset can in principle be retrieved. In
order to evaluate the price of any trip, as needed for a
usable app, we have divided the New York region into a mesh
with cells of size around 100m by 100m in order to index
trips in the database efficiently. For each user query,
we find the set of trips in our dataset whose origin lies
in cells neighbouring the desired origin and, among them,
we find the trip whose destination is closest to the desired
one. This strategy has the advantage of being sufficiently
fast to perform online queries and is expected to provide
reliable price estimates. For the same trip, the Uber price
is obtained through their API.
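
The lookup described above can be sketched as follows. This is
a simplified illustration under our own assumptions rather than
the app's actual implementation: "neighbouring cells" is read
as the 3 by 3 block of cells around the query origin, degrees
are converted to metres with a flat approximation at an assumed
reference latitude, destinations are compared using the
haversine distance, and all names are hypothetical.

```python
import math
from collections import defaultdict

CELL_M = 100.0              # cell side in metres, as in the text
M_PER_DEG_LAT = 111_320.0   # approximate metres per degree of latitude
REF_LAT = 40.75             # assumed reference latitude for New York City


def cell_of(lat, lon):
    """Map a coordinate to integer (row, col) indices of a ~100 m mesh."""
    m_per_deg_lon = M_PER_DEG_LAT * math.cos(math.radians(REF_LAT))
    return (math.floor(lat * M_PER_DEG_LAT / CELL_M),
            math.floor(lon * m_per_deg_lon / CELL_M))


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two coordinates."""
    r = 6_371_000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def build_index(trips):
    """Index trips (o_lat, o_lon, d_lat, d_lon, fare) by their origin cell."""
    index = defaultdict(list)
    for trip in trips:
        index[cell_of(trip[0], trip[1])].append(trip)
    return index


def estimate_fare(index, o_lat, o_lon, d_lat, d_lon):
    """Fare of the indexed trip that starts near the query origin and
    ends closest to the query destination, or None if no trip starts nearby."""
    row, col = cell_of(o_lat, o_lon)
    candidates = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            candidates.extend(index.get((row + dr, col + dc), []))
    if not candidates:
        return None
    best = min(candidates, key=lambda t: haversine_m(t[2], t[3], d_lat, d_lon))
    return best[4]


# Made-up trips, for illustration only.
trips = [
    (40.7500, -73.9900, 40.7600, -73.9800, 9.0),
    (40.7501, -73.9901, 40.7800, -73.9500, 21.0),
    (40.7000, -74.0100, 40.7600, -73.9800, 30.0),
]
index = build_index(trips)
print(estimate_fare(index, 40.7500, -73.9900, 40.7790, -73.9510))
```

Because only the cells around the origin are inspected, the
cost of a query depends on the number of trips starting nearby
rather than on the size of the whole dataset.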

A real-time prototype has been designed and currently
runs on popular mobile platforms. Future improvements
include the possibility to change predictions depending
on the time of the day, or on the expected traffic on
the way, but also to suggest other types of transportation,
such as walking when the distance is sufficiently short,
or walking only part of the way in situations when a small
change in the origin point can lead to a significant
change in the price quote. In the meantime, the current
version (Fig. 6) already provides a fully working
solution, including geolocation services and address
retrieval. We are planning to launch the application on
the related stores very soon.

FIG. 6: The proof of concept.